 drug target


Retrieve to Explain: Evidence-driven Predictions with Language Models

Patel, Ravi, Brayne, Angus, Hintzen, Rogier, Jaroslawicz, Daniel, Neculae, Georgiana, Corneil, Dane

arXiv.org Artificial Intelligence

Machine learning models, particularly language models, are notoriously difficult to introspect. Black-box models can mask both issues in model training and harmful biases. For human-in-the-loop processes, opaque predictions can drive lack of trust, limiting a model's impact even when it performs effectively. To address these issues, we introduce Retrieve to Explain (R2E). R2E is a retrieval-based language model that prioritizes amongst a pre-defined set of possible answers to a research question based on the evidence in a document corpus, using Shapley values to identify the relative importance of pieces of evidence to the final prediction. R2E can adapt to new evidence without retraining, and can incorporate structured data through templating into natural language. We assess R2E on the use case of drug target identification from published scientific literature, where we show that the model outperforms an industry-standard genetics-based approach on predicting clinical trial outcomes.
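
As a rough illustration of the attribution idea in the abstract above, the sketch below computes exact Shapley values over a tiny set of evidence items. The passage names, the relevance numbers and the `toy_score` function are hypothetical stand-ins chosen for this example; they are not R2E's model, data or implementation. Exact enumeration grows exponentially with the number of items, so it is only practical for small evidence sets.

```python
from itertools import combinations
from math import factorial


def shapley_values(evidence, score):
    """Exact Shapley value of each evidence item.

    `score` maps a frozenset of evidence items to a number
    (here: a stand-in for the model's prediction given that evidence).
    """
    n = len(evidence)
    values = {}
    for item in evidence:
        others = [e for e in evidence if e != item]
        total = 0.0
        for k in range(n):  # size of the coalition formed without `item`
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of `item` when added to coalition `s`
                total += weight * (score(s | {item}) - score(s))
        values[item] = total
    return values


# Hypothetical per-passage relevance (assumption for illustration only).
RELEVANCE = {"passage_a": 0.6, "passage_b": 0.3, "passage_c": 0.0}


def toy_score(subset):
    """Hypothetical score: additive relevance plus a small bonus
    when passage_a and passage_b are both present."""
    base = sum(RELEVANCE[e] for e in subset)
    bonus = 0.1 if {"passage_a", "passage_b"} <= subset else 0.0
    return base + bonus


phi = shapley_values(list(RELEVANCE), toy_score)
for name, value in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

In this toy setup the interaction bonus is split evenly between the two passages that create it, and the attributions sum to the difference between the score with all evidence and the score with none.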


A Data-driven Latent Semantic Analysis for Automatic Text Summarization using LDA Topic Modelling

Onah, Daniel F. O., Pang, Elaine L. L., El-Haj, Mahmoud

arXiv.org Artificial Intelligence

With the advent and popularity of big data mining and large-scale text analysis, automated text summarization has become prominent for extracting and retrieving important information from documents. This research investigates aspects of automatic text summarization from the perspectives of single and multiple documents. Summarization is the task of condensing long text articles into short, summarized versions. The text is reduced in size while preserving key information and retaining the meaning of the original document. This study presents the Latent Dirichlet Allocation (LDA) approach used to perform topic modelling on summarised medical science journal articles with topics related to genes and diseases. The PyLDAvis web-based interactive visualization tool was used to visualise the selected topics. The visualisation provides an overarching view of the main topics while allowing deeper meaning to be attributed to the prevalence of individual topics. This study presents a novel approach to summarization of single and multiple documents. The results suggest that terms are ranked purely by the probability of topic prevalence within the processed document using an extractive summarization technique. The PyLDAvis visualization allows flexible exploration of how the terms of each topic relate to the fitted LDA model. The topic modelling result shows prevalence within topics 1 and 2, which reveals similarity between the terms in topics 1 and 2 in this study. The efficacy of the LDA and extractive summarization methods was measured using Latent Semantic Analysis (LSA) and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics to evaluate the reliability and validity of the model.
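
The ROUGE metric mentioned in the abstract above can be illustrated with a minimal sketch of ROUGE-N recall, the fraction of reference n-grams that also appear in the candidate summary. The whitespace tokenization, the function name `rouge_n_recall` and the two example sentences are assumptions made for this example; the authors' actual evaluation setup is not described here.

```python
from collections import Counter


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by the
    number of n-grams in the reference summary."""

    def ngrams(text):
        tokens = text.lower().split()  # naive whitespace tokenization
        return Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (clipped counts).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical example sentences.
reference = "the gene is linked to the disease"
candidate = "the gene is linked to disease"

rouge_1 = rouge_n_recall(candidate, reference, n=1)
rouge_2 = rouge_n_recall(candidate, reference, n=2)
print(f"ROUGE-1 recall: {rouge_1:.3f}")
print(f"ROUGE-2 recall: {rouge_2:.3f}")
```

Here the candidate covers 6 of the 7 reference unigrams and 4 of the 6 reference bigrams, so the bigram score penalizes the dropped word more heavily than the unigram score does.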


Big data and AI meet cancer research

#artificialintelligence

Many cancer patients undergo treatment with multiple drugs, each of which attacks cancer in a different way, so the combination fights cancer on many fronts. But more drugs mean higher risks of side effects. "Most cancer therapy is now a combination treatment," says Avinash (Avi) Sahu, Ph.D., assistant professor at The University of New Mexico Comprehensive Cancer Center. Sahu joined UNM from Harvard and Dana-Farber Cancer Institute. "We wanted to find drugs that could suppress two cancer-causing pathways at the same time."


Drug-target affinity prediction method based on consistent expression of heterogeneous data

Liu, Boyuan

arXiv.org Artificial Intelligence

The first step in drug discovery is finding drug molecule moieties with medicinal activity against specific targets. Therefore, it is crucial to investigate the interaction between drug-target proteins and small chemical molecules. However, traditional experimental methods for discovering potential small drug molecules are labor-intensive and time-consuming. There is currently a lot of interest in building computational models to screen small drug molecules using drug molecule-related databases. In this paper, we propose a method for predicting drug-target binding affinity using deep learning models. This method uses a modified GRU and GNN to extract features from the drug-target protein sequences and the drug molecule map, respectively, to obtain their feature vectors. The combined vectors are used as vector representations of drug-target molecule pairs and then fed into a fully connected network to predict drug-target binding affinity. This proposed model demonstrates its accuracy and effectiveness in predicting drug-target binding affinity on the DAVIS and KIBA datasets.


Verge Genomics takes AI-sourced drug for ALS into clinic

#artificialintelligence

Verge Genomics has joined a select group of biotechs that have taken a drug discovered and developed using artificial intelligence into human testing. The small-molecule PIKfyve inhibitor, called VRG50635, has been administered to the first subject in the phase 1 trial involving healthy volunteers, according to the San Francisco-based biotech, which was founded in 2015 by Alice Zhang and Jason Chen. VRG50635 was discovered using Verge's AI-powered discovery platform ConVERGE, which maps out the biological underpinnings of diseases using data on DNA, RNA, and protein profiles to identify new targets and drugs that can interact with them. The company focuses on diseases of the central nervous system, starting with the neurodegenerative diseases amyotrophic lateral sclerosis (ALS), the indication for VRG50635, and Parkinson's disease. PIKfyve is an enzyme thought to be involved in an underlying disease process in ALS linked to the function of lysosomes, organelles involved in processing waste materials in cells.


Speeding up drug discovery with advanced machine learning

#artificialintelligence

Whatever our job title happens to be at AstraZeneca, we're seekers. We help scientists comb through massive amounts of data in our quest to find the information we need to help us deliver life-changing medicines. AstraZeneca is a research-based biopharmaceutical company headquartered in Cambridge, UK, with strategic research and development (R&D) centers in Sweden, the United Kingdom and the United States. The company has a broad portfolio of prescription medicines, primarily for the treatment of diseases in three therapy areas -- Oncology; Cardiovascular, Renal & Metabolism; and Respiratory & Immunology. At AstraZeneca, our drug discovery and development is guided by our '5R Framework': right target, right patient, right tissue, right safety, right commercial potential.


Bayesian tensor factorization for predicting clinical outcomes using integrated human genetics evidence

Soylemez, Onuralp

arXiv.org Artificial Intelligence

The approval success rate of drug candidates is very low, with the majority of failures due to safety and efficacy. Increasingly available high-dimensional information on targets, drug molecules and indications provides an opportunity for ML methods to integrate multiple data modalities and better predict clinically promising drug targets. Notably, drug targets with human genetics evidence are shown to have better odds of success. However, a recent tensor factorization-based approach found that additional information on targets and indications might not necessarily improve the predictive accuracy. Here we revisit this approach by integrating different types of human genetics evidence collated from publicly available sources to support each target-indication pair. We use Bayesian tensor factorization to show that models incorporating all available human genetics evidence (rare disease, gene burden, common disease) modestly improve clinical outcome prediction over models using a single line of genetics evidence. We provide additional insight into the relative predictive power of different types of human genetics evidence for predicting the success of clinical outcomes.


GNS and the Global Alzheimer's Platform Foundation Partner to advance Alzheimer's R&D - GNS

#artificialintelligence

Somerville, MA, June 1st, 2022: GNS, the leader in the use of "Virtual Patients," Causal AI and simulation technology for biopharmaceutical drug discovery and development, and the Global Alzheimer's Platform Foundation (GAP) today announced a 3-year partnership. This innovative partnership will leverage the fully de-identified dataset of rich clinico-genomic data from GAP's Bio-Hermes study to build the next generation Gemini Virtual Patient in Alzheimer's Disease (AD). The data from the groundbreaking Bio-Hermes study includes samples from more than 1,000 volunteer participants. This is the first Alzheimer's platform study to prioritize diversity in the study protocol. It is the largest AD trial of its kind evaluating biomarkers, surrogate markers, and cognitive tests.


Artificial Intelligence at the Heart of China's New Drug Discovery

#artificialintelligence

It's not much of a surprise to find Artificial Intelligence (AI) playing a central role in the pharmaceutical industry. Chinese firms are relying on AI to put more drugs on the market and, by extension, to offer better services. The country is gathering momentum for an artificial intelligence-backed drug discovery boom. Thanks to the nation's emphasis on innovation-driven development, these companies are operating within a continuously improving innovation ecosystem, according to industry experts and business leaders. "It is not a question of whether China will become a powerhouse in AI-driven drug development even though it started relatively late (in the field). The only question is when that will happen," said an industry leader in AI-based drug discovery.


Machine Learning-Enabled Pipeline for Large-Scale Virtual Drug Screening

#artificialintelligence

Virtual screening is receiving renewed attention in drug discovery, but progress is hampered by challenges on two fronts: handling the ever-increasing sizes of libraries of drug-like compounds and separating true positives from false positives. Here, we developed a machine learning-enabled pipeline for large-scale virtual screening that promises breakthroughs on both fronts. By clustering compounds according to molecular properties and performing limited docking against a drug target, the full library was trimmed 10-fold; the remaining compounds were then screened individually by docking; and finally, a dense neural network was trained to classify the hits into true and false positives. As an illustration, we screened for inhibitors against RPN11, the deubiquitinase subunit of the proteasome and a drug target for breast cancer.
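
The trimming step described above can be sketched roughly as follows: group compounds into clusters, dock only one representative per cluster, keep the best-scoring tenth of the clusters, then dock every remaining compound individually. Everything concrete here is an assumption for illustration: the integer "compounds", the `cluster_of` rule, the `dock` scoring function (lower is taken to mean better binding) and the choice of the first cluster member as representative. The final neural-network classification of hits is not shown, and this is not the authors' implementation.

```python
from collections import defaultdict


def trim_library(compounds, cluster_of, dock, keep_fraction=0.1):
    """Keep only compounds from the best-scoring clusters.

    One representative per cluster is docked (the "limited docking"),
    and the top `keep_fraction` of clusters by that score survive.
    """
    clusters = defaultdict(list)
    for compound in compounds:
        clusters[cluster_of(compound)].append(compound)

    # Dock a single representative (first member) of each cluster.
    rep_scores = {cid: dock(members[0]) for cid, members in clusters.items()}

    n_keep = max(1, round(len(clusters) * keep_fraction))
    best = sorted(rep_scores, key=rep_scores.get)[:n_keep]  # lower = better
    return [compound for cid in best for compound in clusters[cid]]


# Hypothetical toy library: 100 "compounds" identified by integers.
library = list(range(100))


def cluster_of(compound):
    return compound // 10  # hypothetical: 10 clusters of 10 compounds


def dock(compound):
    return abs(compound - 42)  # hypothetical docking score, best at 42


kept = trim_library(library, cluster_of, dock)
ranked = sorted(kept, key=dock)  # dock each remaining compound individually
print(f"library: {len(library)}, after trimming: {len(kept)}")
print(f"top hit: {ranked[0]}")
```

In this toy setup the representative-only docking costs 10 evaluations instead of 100, followed by 10 more for the surviving cluster, which is the kind of saving the clustering step is meant to provide.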